Transmission image reconstruction and imaging using poissonian detector data

ABSTRACT

An image reconstruction method for reconstructing an image  f   min  representing a region of investigation within an object includes providing detector data (y i ) having Poisson random values from an X-ray transmission measurement using an X-ray source and a detector device, the detector data (y i ) being measured at an i-th of a plurality of different pixel positions of the detector device relative to the object, and reconstructing the image  f   min  based on the detector data (y i ), the reconstructing step including a procedure of minimizing a functional 
     
       
         
           
             
               F 
                
               
                 ( 
                 
                   f 
                   _ 
                 
                 ) 
               
             
             = 
             
               
                 
                   1 
                   k 
                 
                  
                 
                   
                     ∑ 
                     
                       i 
                       = 
                       1 
                     
                     k 
                   
                    
                   
                       
                   
                    
                   
                     ( 
                     
                       
                         μ 
                         i 
                       
                       - 
                       
                         
                           y 
                           i 
                         
                          
                         log 
                          
                         
                             
                         
                          
                         
                           μ 
                           i 
                         
                       
                     
                     ) 
                   
                 
               
               + 
               
                 a 
                  
                 
                   
                      
                     
                       
                         T 
                         
                           - 
                           1 
                         
                       
                        
                       
                         f 
                         _ 
                       
                     
                      
                   
                   p 
                 
               
             
           
         
       
     
     where  f  is a current test image used for minimizing the functional F( f ). The image  f   min  represents the global minimum of F( f ).

ACKNOWLEDGEMENT OF GOVERNMENT SUPPORT

The work leading to this invention has received funding from theEuropean Atomic Energy Community's Seventh Framework Program([FP7/2007-2011]) under grant agreement n^(o) FP7-212100.

FIELD OF THE INVENTION

The invention relates to an image reconstruction method forreconstructing an image of an object based on detector data contaminatedwith Poisson noise, especially detector data collected with an X-raytransmission device, like a computed tomography (CT) imaging device or aplanar X-ray imaging device, operating e.g. under low photon-fluxconditions. In particular, the image reconstruction method is based on anon-analytical image reconstruction using a minimizing algorithm.Furthermore, the invention relates to an imaging method including theimage reconstruction method. Furthermore, the invention relates to animaging device implementing the imaging and image reconstructionmethods.

BACKGROUND OF THE INVENTION

In the present specification, reference is made to the followingpublications illustrating prior art of conventional image reconstructionand imaging techniques.

-   [1] David Donoho. For most large underdetermined systems of linear    equations, the minimal l¹ norm near-solution approximates the    sparsest near-solution. Communications on Pure and Applied    Mathematics, 59(7):907-934, 2006;-   [2] Jarvis Haupt and Robert Nowak. Signal reconstruction from noisy    random projections. IEEE Trans. on Information Theory,    52(9):4036-4048, 2006;-   [3] Maxim Raginsky, Rebecca M. Willett, Zachary T. Harmany, and    Roummel F. Marcia. Compressed sensing performance bounds under    poisson noise. IEEE Trans. on Signal Processing, 58(8):3990-4002,    2010;-   [4] Emmanuel Candès and Justin Romberg. Sparsity and incoherence in    compressive sampling. Inverse Problems, 23(3):969-985, 2007;-   [5] C. C. Craig. On the Tchebychef inequality of Bernstein. Ann.    Math. Statist., 4(2):94-102, 1933;-   [6] J. M. Noras. Some formulas for moments of the Poisson    distribution. Phys. Rev. B, 22(12):6474-6475, 1980;-   [7] Emmanuel J. Candès et al. “An Introduction To Compressive    Sampling” in “IEEE Signal Processing Magazine” March 2008, p. 21-30;    and-   [8] Rebecca M. Willett et al. “Performance bounds on compressed    sensing with Poisson noise” in “IEEE International Symposium on    Information Theory ISIT” 2009, Jun. 28, 2009-Jul. 3, 2009, p.    174-178; and-   [9] PCT/EP2010/000526 (not published at the priority date of the    present specification).

The reconstruction of a three-dimensional CT image is often accomplishedby the well-known filtered back projection (FBP) algorithm, which iswell suited for conventional CT imaging. However, this algorithm hassome drawbacks. First, it requires an ideal measurement, i.e. the anglesunder which the patient is viewed must be many, and they should beequidistant. Second, the FPB does not work easily with noise. The lattermeans that the image acquisition must be done in such a way as tominimize noise, which requires a high dose. An alternative to FBP is toreconstruct the image iteratively using standard techniques such asMaximum Likelihood Expectation Maximization (MLEM). This method has theadvantage that the imaging setup is in principle relatively arbitrary:for example, angles do not have to be equidistant (although they oftenare). Furthermore, the Poissonian nature of the noisy measurements isbuilt into the method. This type of algorithm is becoming more and morefeasible with the amount of computer power currently available.Nevertheless, it is still necessary to record a lot of information inorder to obtain a good reconstruction in the end.

In contrast to this, the concept of compressive sensing (CS) promises,in principle, that a similarly good reconstruction can be obtained withsignificantly less recorded information. It relies on the idea thatmost, if not all, real world objects are “compressible” if they arerepresented in a suitable basis system. Compressible means that thecoefficients of the representation, when ordered by size, fall off fastwith some power of their index. The information that an object iscompressible can be used in addition to the relatively few recorded datapoints to obtain a reliable reconstruction. With more details,compressed sensing (or: compressive sensing, compressive sampling andsparse sampling), is a technique for acquiring and reconstructing asignal utilizing the prior knowledge that it is sparse or compressible.An introduction to CS has been presented by Emmanuel J. Candès et al.[7]. The CS theory shows that sparse signals, which contain much lessinformation than could maximally be encoded with the same number of dataentries, can be reconstructed exactly from very few measurements innoise free conditions.

According to Jarvis Haupt et al. [2], a set of data f*_(j) of size v(j=1, . . . , v) which is sparse (or rather “compressible”) can beaccurately reconstructed from a small number k of random projectionseven if the projections are contaminated by noise of constant variance,e.g. Gaussian noise. Specifically, y_(i)=Σ_(j)φ_(ij)f*_(j)+ξ_(i) withi=1, . . . , k are the noisy projections of f*_(j), taken with theprojection matrix φ_(ij) which consists of random entries all drawn fromthe same probability distribution with zero mean and variance 1/v (suchthat the transformed values y_(i) have the same order of magnitude thanthe original ones) and the noise ξ_(i) is drawn from a Gaussianprobability distribution with zero mean and variance σ². By finding theminimizer {circumflex over (f)}_(j) of a certain functional (to be shownin detail below), one obtains an approximation to f*_(j) for which theaverage error is bounded by a constant times (k/log v)^(−a) with 0<a≦1,i.e. the error made depends only logarithmically on v. To put it anotherway, the error can be made small by choosing k/log v large, but it is byno means necessary to have k/v close to 1. Accurate reconstruction ispossible even if the number of projections is much smaller than v, aslong as k/log v is large.

A crucial point in the derivation of the above result is the fact thatthe variance of the noise ξ_(i) is a constant. Even though a similarresult could be obtained for non-Gaussian noises (provided certain noiseproperties required for the validity of the Craig-Bernstein inequalitycan be proved), the result does not easily carry over to the case thatthe variance of the noise depends on the values f*_(j). Yet this isprecisely what happens e.g. in photon limited imaging systems where amain source of noise is the discrete quantum nature of the photons. Inthis case the projections y_(i) have Poisson statistics with parameterμ_(i)=Σ_(j)φ_(ij)f*_(j). This parameter is equal to the mean of y_(i)but also to its variance.

In the past, it was tested whether the principle of accuratereconstructions from few projections carries over to Poisson noise inorder to make accurate reconstructions possible with fewer measurements,e.g. in emission tomography. It was expected that the compressivesensing strategy for the reconstruction of sparse or compressibleobjects from few measurements is difficult to apply to data corruptedwith Poisson noise, due to the specific properties of Poisson statisticsand the fact that measurements can not usually be made randomly, as inmany other cases.

Rebecca M. Willett et al. [8] have generalized results from Jarvis Hauptet al. to Poisson noise. It was proposed to reconstruct a tomographicimage from detector data using a procedure of minimizing a functional{circumflex over (f)} depending on a sensing matrix A and the detectordata and further depending on a penalty term, wherein the sensing matrixA is constructed on the basis of statistic Rademacher variables and thepenalty term depends on the sparsity of the object. However, the resultwas discouraging: it was found that the upper bound on the errorincreases with the number of measurements, i.e. more measurements seemto make the accuracy smaller. Thus, it was assumed that compressivesensing ordinarily only works with noise types which have a fixedvariance.

An image reconstruction method is proposed in [9] for reconstructing anemission tomographic image of a region of investigation within an objectfrom detector data comprising Poisson random values measured at aplurality of different positions relative to the object. The positionsare the spatial locations of measuring the detector data, e.g. theposition of a detector element (pixel) of a SPECT detector device at thetime of collecting data with this detector device under its currentangular position with respect to the object and under its distance fromthe centre of rotation, or as a further example, spatial locations ofdetector elements sensing data at the lines of response (LOR's) ofcoincidence events measured with a PET detector device. According to[9], a predetermined system matrix assigning the voxels of the object tothe detector data is provided. Furthermore, the tomographic image isreconstructed by minimizing a functional depending on the detector dataand the system matrix and additionally including a sparse or compressiverepresentation of the object in an orthogonal basis. The orthogonalbasis is an orthogonal matrix, the columns of which are orthonormalbasis vectors. The orthobasis is selected such that the object, inparticular the region of investigation, can be represented in theorthobasis fulfilling the requirements of sparsity or compressibility.Contrary to the above approach of Rebecca M. Willett et al. [8] using arandom sensing matrix, elements of the system matrix are not statisticvalues but rather selected by geometrically or physically assigningcontributions of each of the voxels (object data) to each detectorelement of a detector device or the associated detector data, resp. Thesystem matrix which defines the detector data as linear combinations ofthe original object data is determined by geometric or physical featuresof the imaging device, in particular by the arrangement of the objectrelative to the detector device and the detector device geometry.

OBJECTIVE OF THE INVENTION

The objective of the invention is to provide an improved imagereconstruction method for reconstructing an image of an object, inparticular for transmission tomography purposes, which is capable ofavoiding disadvantages of conventional techniques. In particular, theobjective is to provide an image reconstruction method which enablesreconstructing the transmission image with a reduced quantity ofdetector data (such as for instance fewer angles for CT orsuper-sampling for planar X-ray images) without a loss in image quality.Furthermore, the objective of the invention is to provide an improvedimaging method avoiding disadvantages of conventional imagingtechniques. Furthermore, the objective of the invention is to provide animproved imaging device in particular being adapted for conducting theinventive imaging method.

SUMMARY OF THE INVENTION

The above objectives are solved by an image reconstruction method, animaging method and/or an imaging device comprising the features of theindependent claims. Advantageous embodiments of the invention aredefined in the dependent claims.

According to a first aspect of the invention, an image reconstructionmethod is proposed for reconstructing an image f ^(min) representing aregion of investigation within an object. Firstly, detector data (y_(i))are provided which comprise Poisson random values from an X-raytransmission measurement using an X-ray source and a detector device,said detector data (y_(i)) being measured at an i-th of a plurality ofdifferent pixel positions of the detector device relative to the object.Subsequently, the image f ^(min) is reconstructed based on the detectordata (y_(i)) using a procedure of minimizing the functional F(f)

${F\left( \underset{\_}{f} \right)} = {{\frac{1}{k}{\sum\limits_{i = 1}^{k}\; \left( {\mu_{i} - {y_{i}\log \mspace{14mu} \mu_{i}}} \right)}} + {a{{T^{- 1}\underset{\_}{f}}}_{p}}}$

whereinf is a current test image used for minimizing the functional F(f),

$\frac{1}{k}{\sum\limits_{i = 1}^{k}\; \left( {\mu_{i} - {y_{i}\mspace{11mu} \log \mspace{11mu} \mu_{i}}} \right)}$

is a maximum-likelihood risk functional for Poisson statistics, saidparameters μ_(i) being transmission projections of the test image f,said projections being computed according to Beer-Lambert's law at thei-th pixel position relative to the X-ray source,|T⁻¹ f|_(p) is a sparsity enforcing functional including the l_(p) normof vector T⁻¹ f with 0≦p<2, said vector T⁻¹ f being a sparse orcompressive representation of f in a (bi-)orthogonal basis T, anda is a calibration factor. The image f ^(min) to be reconstructed is theglobal minimum of the functional F(f).

The inventors have found that for the reconstruction of X-raytransmission images, the functional to be minimized for finding thereconstructed image, is defined depending on the Beer-Lambert's law (or:Beer's law). The Beer-Lambert's law states that the total attenuation isequal to the exponential of the negative integral of the localattenuation coefficients over the radiation path through the object.Accordingly, there is a logarithmic dependency between the transmissionof radiation through the object and the product of the local attenuationcoefficient of the object and the distance the radiation travels througha volume element, summed over the volume elements along the path.

The inventors have shown that the image reconstruction works withPoissonian noise, which does not have a fixed variance (the variance ofa Poisson variable x with average value Ex=λ is equal to E(x²)−E²(x)=λand thus signal dependent). The invention thus allows for reducing thenumber of angles and dose by making use of the compressive nature of theobject (the patient) and by modelling and incorporating the noisy natureof the measurements, resulting in an accurate reconstruction with a highsignal to noise ratio.

Optionally, the functional F(f) additionally can include an additiveregularization function R(f):

${F\left( \underset{\_}{f} \right)} = {{\frac{1}{k}{\sum\limits_{i = 1}^{k}\; \left( {\mu_{i} - {y_{i}\; \log \mspace{14mu} \mu_{i}}} \right)}} + {a{{T^{- 1}\underset{\_}{f}}}_{p}} + {{R\left( \underset{\_}{f} \right)}.}}$

As an advantage, the regularization function R can be constructed forsuppressing artefacts in the reconstructed tomographic image.

As a main advantage, the invention provides a new method to accuratelyreconstruct a three or two dimensional image from very few detectordata, e.g. projections, by non-analytical or algebraic methods using aminimizing algorithm. The inventors have found methods of exploiting thefact that real world objects are usually sparse or at least compressiblewhile overcoming the restrictions of conventional approaches forapplying CS on Poisson noise data. X-ray transmission images can bereconstructed with a number of measurements which is reduced comparedwith conventional measurements, whereas the same accuracy as withconventional methods can be obtained. Accordingly measurement timeand/or applied activity (for emission tomography) or irradiation dose(for transmission tomography) can thus be reduced.

For conventional reconstructions there are (at least) three problems.First, the data are noisy, second, it is necessary to perform a largenumber of measurements in order to obtain sufficient information for anacceptable reconstruction quality and third, the reconstruction problemis ill-posed such that without further measures even an infinitesimalamount of noise generates serious artifacts. The invention is capable ofovercoming all three difficulties by providing a link between acompressive sensing strategy (solving problem two) and a maximumlikelihood approach (which deals optimally with problem one) for thecase of Poisson noise, while still allowing for inclusion ofregularization techniques addressing problem three.

According to the invention, the image f ^(min) is reconstructed byminimizing the functional in particular including a sparse orcompressive representation of the object in a (bi-) orthogonal basis T.The (bi-)orthogonal basis (or simply basis in the following) T is a(bi-) orthogonal matrix, the columns of which are basis vectors. Thebasis is selected such that the object, in particular the region ofinvestigation, can be represented in the basis fulfilling therequirements of sparsity or compressibility.

The minimizing procedure is a non-analytical or algebraic procedure,e.g. an iterative algorithm. The reconstructed image f ^(min) comprisesthe image data representing the global minimum of the functional F(f).In particular, the inventive reconstruction comprises minimizing theabove functional F(f), which includes two or optionally three additiveparts, namely an empiric risk part, which represents the Poissonstatistic of the measurement, a CS part representing the compressibilityof the object and (optionally) a regularization term.

The calibration factor a is a predetermined fixed parameter, which isnot changed during the minimization procedure, i.e. it is not a fitparameter. Further features of the calibration factor a are describedbelow. By a preliminary determination of the calibration factor a, theimage reconstruction can be improved. This is a further differencecompared with the above conventional approach of Rebecca M. Willett etal.

According to a first preferred embodiment of the invention (CTembodiment), the X-ray transmission measurement is an X-ray CT imagingof the object and the image (f_(j) ^(min)) is a volumetric(two-dimensional slices or a three-dimensional volume) reconstruction ofthe object. Using the Beer-Lambert's law, the parameters μ_(i) are

μ_(i)=Σ_(i′) B′ _(ii′) I ₀exp(−Σ_(j) A _(i′j) f _(j))+r

wherein B′_(ii′) is a matrix representing a response function of thedetector device assigning an i′-th spatial position on the detectorsurface to the i-th detector data (y_(i)),I₀ is an intensity of the unattenuated X ray beam,A_(i′j) is a predetermined system matrix assigning a j-th voxel of theobject (1) to the i′-th spatial position on the detector surface, andr is a background count parameter of the detector device.

With the CT embodiment, the predetermined system matrix A_(ij) assigninga j-th voxel of the object to the i′-th detector data is provided.Contrary to the above conventional approach of Rebecca M. Willett et al.using a random sensing matrix, elements of the system matrix A_(ij) arenot statistic values but rather selected by geometrically or physicallyassigning contributions of each of the voxels (object data) to eachdetector element of a detector device or the associated detector data,resp. The system matrix which defines the detector data as linearcombinations of the original object data is determined by geometric orphysical features of the imaging device, in particular by thearrangement of the object relative to the detector device and thedetector device geometry. In addition, influences such as focal spotsize, heel effect or body scatter may be encoded in the system matrix.

The basis T is selected to be as incoherent as possible with respect tothe system matrix A_(ij) and such that the object f (or a comparabletypical object) has a compressible representation in this basis. Thesecriteria were introduced previously e.g. by J. Candès et al. (seeabove). A particular advantage of this embodiment, which is explainedwith further details below, can be obtained by the application of thel_(p) norm with 0≦p<2, e.g. p=1. In the case p≧1, the minimizationprocedure has a single global minimum allowing an implementation with anincreased processing speed.

According to a second preferred embodiment of the invention (planarimaging embodiment), the X-ray transmission measurement is a planarX-ray imaging of the object and the image (f_(i′) ^(min)) is a planarreconstruction of an X-ray attenuation image of the object. Using theBeer-Lambert's law, the test image f_(i′) is given in terms of a threedimensional test object f_(j) ^(A) as

f _(i′) =I ₀exp(−Σ_(j) A _(i′j) f _(j) ^(A)),

A_(i′j) is a predetermined system matrix assigning a j-th voxel of thetest object to the i′-th spatial position on the detector surface, and

-   -   the parameters μ_(i) are

μ_(i)=Σ_(i′) B′ _(ii′) f _(i′) +r,

whereinB′_(ii′) is a matrix representing a response function of the detectordevice assigning an i′-th spatial position on the detector surface tothe i-th detector data (y_(i)), andr is a background count parameter of the detector device.

A further advantage of the invention is given by the fact that variousapproaches for determining (adjusting) the system matrix A_(i′j) and/orthe response function B′_(ii′) are available. As a first variant, thesystem matrix and/or the response function can be adjusted usingreference data of the measuring system used for collecting the detectordata. The reference data can be stored as specific data of the measuringsystem in the image reconstruction device, e.g. in the imaging device.According to a second variant, the entries of the system matrix and/orof the response function can be acquired using a calibrationmeasurement. In this case, collecting the detector data is performedwith a calibration object (phantom) having known volumetric or planartransmission object data. The calibration detector data can be used foradjusting the elements of the system matrix and/or of the responsefunction.

Another advantage of the invention is given by the fact that variousbases T are available. As it is known from the theory ofthree-dimensional wavelets (see A. Cohen et al. in “American Journal ofMathematics”, Volume 121, 1999, p. 587-628, and P. Bechler et al. in“Transactions of American Mathematical Society”, Volume 359 (2), 2007,p. 619-635), objects with so-called bounded variation have compressiblewavelet coefficients. This condition is indeed fulfilled for the objectsof the real world, so that a suitable orthobasis using three-dimensionalwavelets with a finite support can be found for any object to beinvestigated. Therefore, according to a referred embodiment of theinvention, T is a basis of three-dimensional wavelets with a compactcarrier. Alternatively, if a certain prior knowledge is available aboutthe object or the structure thereof, an adapted basis can be constructedwith entries depending on typical object data of the object to beimaged. In this case, the orthobasis can be matched to the type ofobject for improving the image reconstruction. As an example, if it isknown that the object is sufficiently smooth, it will have predominantlylow spatial frequency components such that a Fourier basis is adequate.Alternatively, one could choose a basis of Chebyshev polynomials in theplane combined with a one dimensional Fourier basis in the axialdirection. For real world objects a principal components analysis of anumber of reference objects may generate a suitable basis.

The image reconstruction method can be conducted immediately aftercollecting the detector data, in particular with the imaging devicecollecting the detector data. Alternatively, the image reconstructionmethod can be conducted with an image reconstruction device at a distantlocation and/or with a delay after the measurement with detector dataprovided e.g. via a data communication channel from the imaging deviceto the image reconstruction device or from a data storage. Accordingly,the detector data can be provided e.g. via a data network, from a datastorage or directly by the detector device.

Further important features of the invention may be represented byadditional steps conducted after the image reconstruction, like e.g.storing, recording, displaying and/or further image processing of thereconstructed image.

According to a second aspect of the invention, an imaging method forreconstructing an image of a region of investigation within an object isproposed, which comprises the steps of collecting the detector data withthe detector device of an imaging device and subjecting the detectordata to the inventive image reconstruction method according to the abovefirst aspect of the invention. Preferably, the imaging method is adaptedfor collecting a CT image or a planar transmission image.

According to a further, particularly preferred embodiment of theinvention, a preparation step can be provided before the collection ofthe CT detector data. The preparation step comprises a setting of animage quality measure (risk) to be fulfilled by the imaging method.Depending on the risk, the dose and the number of angular positions areselected at which detector data are collected. Accordingly, forobtaining a preview image wherein a low image quality is sufficient, areduced number of angular positions is used, while for a subsequentregular imaging with improved image quality, the dose and/or the numberof angular positions (and correspondingly detector data) is increased.

According to a third aspect of the invention, an imaging device forimaging a region of investigation in an object is proposed, whichcomprises an X-ray source, a detector device for measuring detector datacomprising Poisson random values measured at the i-th angular positionof a detector element of the detector device relative to the object, anda reconstruction device for reconstructing an image the object, whereinthe detector data are subjected to the image reconstruction methodaccording to the above first aspect of the invention.

Further independent subjects of the invention are a computer programresiding on a computer-readable medium, with a program code for carryingout the image reconstruction method according to the above first aspectof the invention, and an apparatus comprising a computer-readablestorage medium containing program instructions for carrying out theimage reconstruction method according to the above first aspect of theinvention.

Further advantages of the invention can be summarized as follows. First,the quantity of measurements for obtaining a certain image quality canbe reduced compared with the conventional imaging technique yielding thesame image quality. The measuring time and the burden for the patientcan be reduced accordingly. In particular, artefacts resulting fromunintended movements of the object, e.g. of the patient, can be avoided.Second, the image reconstruction method does not include a freeparameter. In particular, if sufficient information is available as tothe compressibility of the object data, the X-ray transmission image canbe obtained exclusively on the basis of the detector data and the systemmatrix determined by the measuring arrangement. Third, thereconstruction risk (see below) can be controlled. The quality of areconstruction can be quantitatively evaluated. The image reconstructionmethod considers Poisson statistics not only in the evaluation of therisk, but rather also with the reconstruction as such, i.e. theadvantages of the maximum-likelihood reconstruction are associated withthe advantage of a reduced number of measurements due to applying the CStechnique. Fourth, instead of only reducing the number of measurementsan appropriate combination of reduced measurements and reduced dose maybe chosen which maintains the same risk. Finally, the inventive methodcan be extended in analogy to the penalized-likelihood method for aregularization of the reconstructed object.

The invention provides further particular advantages for the imagingmethod, in particular for selecting or setting the angular positions atwhich the detector data are collected. According to a preferredembodiment of the invention, a randomized setting of the angularpositions can be provided. According to a further, alternativeembodiment of the invention, the detector data can be continuouslycollected, e.g. with a CT device while the angular positions of thedetector device thereof are changed continuously and not in the usualstep-and-shoot mode. Accordingly, the imaging speed can be essentiallyincreased.

BRIEF DESCRIPTION OF THE DRAWINGS

Further details and advantages of the invention are described in thefollowing with reference to the attached drawings, which show in:

FIGS. 1 and 2: schematic representations of imaging devices embodyingthe present invention;

FIG. 3: a schematic flowchart illustrating embodiments of the inventiveimage reconstruction and imaging methods; and

FIG. 4: experimental results illustrating the improved image qualityobtained with the present invention.

DESCRIPTION OF PREFERRED EMBODIMENTS

Preferred embodiments of the invention are described here with exemplaryreference to transmission tomography, in particular CT, and planarimaging. Details of measuring techniques, like e.g. details of CT orother X-ray transmission imaging devices and modes of operating thosedevices are not described as far as they are known from conventionaltechniques. Furthermore, the invention is not restricted to medicalimaging, but rather can also be implemented for imaging other objects,like e.g. work pieces.

Embodiments of the invention are described in the following withreference to the procedural steps of the inventive methods. Implementingthese features as data processing tools applied to raw data obtainedwith a measuring system (imaging device) will depend on e.g. themeasuring system used and the type of object to be imaged in practice.While the understanding and the implementation of the features is basedon the mathematical background outlined below, the skilled person willrecognize that the invention is not restricted to each mathematicaldetail, but covers all data processing tools generally based on thismathematical background.

1. Imaging Device

FIG. 1 schematically illustrates a schematic view of the firstembodiment (CT embodiment) of an imaging device 100 including an X-raysource 10, a detector device 20, a reconstruction device 30, an objectcarrier 40, and a carrier of the source and the detector 50. Thecomponents 10, 20, 40 and 50 are structured like in a conventional CTdevice of any generation. In particular, the detector device 20comprises a conversion layer 21 for converting X-ray radiation intooptical radiation and a detection layer 22 with an array of detectorelements 23.i (i=1, 2, 3, . . . ), which are arranged in a one- or, morecommonly, two-dimensional geometry (such as a line-shaped ormatrix-shaped array, or a plane or curved array). The detector designshown here is only for illustration; other designs (e.g., detectors witha curved surface) are possible. The source and detector carrier 50 isrotatably mounted such that it can be rotated in an x-y-plane (drawingplane) around the object carrier 40. The object carrier 40 is a supportdevice, e.g. table, configured for accommodating the object 1, e.g. apatient to be investigated. As an example, the object carrier 40comprises a carrying bed for accommodating the patient as schematicallyillustrated in FIG. 1. The source and detector carrier 50 comprises e.g.a circular rail to which the X-ray source 10 and the detector device 20are attached. The object carrier 40 and the detector carrier 50 can betranslated relative to each other in a direction perpendicular to thex-y-plane.

The object 1 is e.g. a human patient, wherein a certain portion (regionof investigation), e.g. the brain is to be investigated with theinventive imaging method. In terms of the imaging and imagereconstruction methods, the region of investigation is considered asconsisting of a plurality of voxels 2. The j-th voxel 2 is schematicallyshown for illustrative purposes.

The reconstruction device 30 comprises an input circuit 31, a processingcircuit 32, an output circuit 33, and an entry storage 34. The inputcircuit 31 is arranged for receiving the detector data (raw data) andfurther entries, which are used for implementing the inventive imagereconstruction method. To this end, the detector device 20 is connectedwith the input circuit 31 for transferring detector data y_(i) and theassociated geometrical parameters such as the table feed or the angularposition α_(l). Furthermore, the input circuit 31 is connected with theentry storage 34 being arranged for storing the system matrix A_(ij),the basis T and optionally a regularization function R. The processingcircuit 32 includes a computer unit, which is configured forimplementing the data processing of the image reconstruction method.Finally, the output circuit 33 may comprise a display 33.1, a datastorage 33.2 and/or an interface 33.3. In particular, the output circuit33 may include a Dicom-node. The reconstruction device 30 can beimplemented with a standard computer having an appropriate hardwareconfiguration, preferably including a GPU acceleration.

FIG. 1 schematically illustrates the imaging process. The attenuationcoefficient at voxel j, f_(j) ^(A), contributes with an attenuationfactor of e^(−A) ^(i′j) ^(f) ^(j) ^(A) to the average signal f_(i′) ^(B)arriving at position i′ of the conversion layer, see Eq. (1) below. Thisideal signal is further modified by the response function (modeled by amatrix B_(ii′)) of the detector system, i.e. the signal f_(i′) ^(B) ati′ adds the amount B_(ii′)f_(i′) ^(B) to the actual average signalμ*_(i) finally recorded at detector pixel i (Eq. (2) below). For CT, thetube and the detector are rotated around the patient and images aretaken for many angles.

FIG. 2 illustrates a schematic view of the second embodiment (planarimaging embodiment) of an imaging device 100 including an X-ray source10, a detector device 20, a reconstruction device 30, and an objectcarrier 40. For planar X-ray imaging, the imaging process is similar toFIG. 1 but there is only one image being taken and the tube and detectorare not rotated around the object 1.

2. Image Reconstruction and Imaging Methods

With the situation depicted in FIGS. 1 and 2, one has a (threedimensional) attenuation map f_(j) ^(A) (the subscript j indexes thevoxels, of which there are v_(A)) which is projected through X-rayprojections of the form

$\begin{matrix}{f_{i^{\prime}}^{B} = {I_{0}{\exp\left( {- {\sum\limits_{j}^{\;}\; {A_{i^{\prime}j}f_{j}^{A}}}} \right)}}} & (1)\end{matrix}$

onto v_(B) spatial positions on the conversion layer 21, where I₀ isproportional to the intensity of the unattenuated X-ray beam and A_(i′j)is the v_(B)×v_(A) system matrix, for example the discretized integralkernel of the Radon transform. Both the attenuation coefficients f_(j)^(A) and the entries of the system matrix A_(i′j) are nonnegative onphysical grounds, and naturally the f_(i′) ^(B) are nonnegative as well.However, the detection layer 22 does not measure these projectionsdirectly. First, the detector itself has a response function which ismodelled by a n×v_(B) matrix B. It can be thought of as the point spreadfunction (PSF) of the detector but can also contain other (linear)characteristics. It is, however, required that B_(ii′)≧0 for all i,i′.In addition, in practice there is always a background count rate r>0.Hence the average detector signal is given by

$\begin{matrix}{{\overset{\_}{\mu}}_{i}^{*} = {{\sum\limits_{{ii}^{\prime}}^{\;}\; {B_{{ii}^{\prime}}f_{i^{\prime}}^{B}}} + {r.}}} & (2)\end{matrix}$

Yet even these projections μ*_(i) are not directly observed, but theyact as the parameter of a Poisson process, i.e. one measures a numbery_(i) of quanta (γ photons, for instance) drawn from a Poissondistribution

${P\left( y_{i} \right)} = {^{- {\overset{\_}{\mu}}_{i}^{*}}{\frac{\left( {\overset{\_}{\mu}}_{i}^{*} \right)^{y_{i}}}{y_{i}!}.}}$

From these measurements y_(i) the original object f_(j) ^(A) is to bereconstructed as accurately as possible from as few projections aspossible. This is the situation usually seen in CT. Note that for CT,the patient is viewed from many angles and thus the index i (andlikewise i′) is in fact a double index i=(k,α) where k indexes thespatial position of the pixels on the detector and α indexes the angles.

Although the main focus is CT (FIG. 1), with a slight modification theinvention is also applicable to planar images (FIG. 2). Such planarimages could result from a single X-ray exposure but also, for instance,from whole body planar nuclear medicine measurements. The onlydifference is that for planar images one is not interested in the fullthree-dimensional information f_(j) ^(A) but only in the ideal planarprojections f_(i′) ^(B), i.e. one would only like to improve the imagesby eliminating the detector response and Poisson noise, and/or forexample by attempting to achieve superresolution. This differencebetween three-dimensional and planar reconstruction will have someconsequences in the following.

The main result of this invention is that an accurate reconstruction ofnoisy Poisson data with few measurements can be achieved by minimizing acertain functional shown below. “Few” measurements here means that ofthe n measurements the system is in principle able to make (n can bethought of as the amount of pixels measured by a conventional CT) only kare actually being made, with k<<n, thus saving time and dose. It isassumed (and it is reasonable to assume) that the object beingreconstructed is compressible in some orthogonal or biorthogonal basis,here denoted by a (bi-) orthogonal matrix T⁻¹. This could be a waveletbasis, or some other, more problem specific basis. The reconstruction isdone by finding the minimizer f of the functional

$\begin{matrix}{{F\left( \underset{\_}{f} \right)} = {{\frac{1}{k}{\sum\limits_{i = 1}^{k}\; \left( {\mu_{i} - {y_{i}\mspace{11mu} \log \mspace{11mu} \mu_{i}}} \right)}} + {a{{T^{- 1}\underset{\_}{f}}}_{p}}}} & (3)\end{matrix}$

(cf. Eq. (18) below) with (using the notation f instead of f ^(A) forconvenience)

$\begin{matrix}{\mu_{i} = {{\sum\limits_{i^{\prime}\;}^{\;}\; {B_{{ii}^{\prime}}^{\prime}I_{0}{\exp\left( {- {\sum\limits_{j}^{\;}\; {A_{i^{\prime}j}f_{j}}}} \right)}}} + r}} & (4)\end{matrix}$

in case of a CT reconstruction (FIG. 1) and, in case of planar images(FIG. 2),

$\begin{matrix}{{\mu_{i} = {{\sum\limits_{i^{\prime}}^{\;}\; {B_{{ii}^{\prime}}^{\prime}f_{i^{\prime}}}} + r}},} & (5)\end{matrix}$

this time writing f instead of f ^(B). The sum in Eq. (3) goes over thek pixels actually measured, and the y_(i) are the measured Poissonvalues. Correspondingly, the matrix B′ is equal to the full matrix Brestricted to the pixels measured.

This result is shown for the p-norm | . . . |_(p) with p=0 but can beextended to the convex case p=1 based on [1] and, on similar grounds, toother sparsity-enforcing norms with 0≦p<2. The prefactor a is known forp=0 and can be chosen empirically or self-consistently for other valuesof p.

The essential steps of the inventive image reconstruction method areschematically summarized in FIG. 3. First, the detector data with theassociated positions relative to the object 1 as well as the remaininginput entries are provided with step S1. Subsequently, thereconstructing step S2 with finding the object minimizing functional ofequation (3) is conducted, followed by step S3 of presenting thereconstructed image consisting of the minimizer of the functional.

Step S1 in particular includes the provision (step S11) of the detectordata (as measured preferably) from positions and associated angles ofthe detector elements 23. Step S11 depends on particular applicationconditions and the data source used. The detector data and positions canbe measured (step S01) and input directly from the detector device 20(FIG. 1 or 2). In this case, the procedure of FIG. 3 including the stepS01 represents an embodiment of the inventive imaging method. Measuringthe detector data is conducted as commonly known from CT or X-raytransmission devices. Optionally, selecting a number of measurements independence on the image quality to be obtained can be provided on thebasis of the considerations summarized below. Furthermore, as analternative, the detector data and positions can be input via a datacommunication channel (step S02). With this variant, steps S1 to S3represent an embodiment of the inventive image reconstruction method.

Step S1 further includes step S12 of determining a sparsity basis Twhich is suitable for the measured object, e.g. a wavelet basis withfinite support. Depending on the type of object, the basis can be storedin the entry storage 34 (FIG. 1 or 2). Furthermore, step S1 includes theprovision of a predetermined system matrix A_(ij) (step S13). The systemmatrix is computed or acquired from reference system data. Inparticular, the system matrix represents the contribution of the j-thvoxel to the i′-th detector data (FIG. 1 or 2).

With step S2, the data processing is conducted including theminimization routine, in which the three-dimensional object is sought,which realizes the global minimum of the object minimizing functional ofequation (3). Details of this functional are outlined in section 3.below.

Finally, the result of the minimization routine is output as the finalresult with step S3, where it is transferred e.g. to a Dicom-nodeincluded in the output circuit 33 (FIG. 1 or 2).

3. Mathematical Background

In the following, it is demonstrated that the disappointing andseemingly counterintuitive result presented by Rebecca M. Willett et al.is essentially due to a too restrictive notion of “error”: since thevariance of a Poissonian random variable increases when its mean valueincreases, it cannot be expected to obtain the same absolute accuracywhen f*_(j) is large as when f*_(j) is small. Yet the measure of errorused in by Rebecca M. Willett et al. was the risk

${R\left( {f^{*},f} \right)} = {\frac{1}{v}{\sum\limits_{v}\left( {f_{j} - f_{j}^{*}} \right)^{2}}}$

(up to normalisation), which does not take this into account. Note thatthis observation has nothing to do with the type of reconstructionalgorithm and lies in the nature of the Poisson statistics.

When the definition of “risk” is modified to reflect this fact, it willbe shown that the principles of compressive sensing can indeed be usefuland allow the reconstruction of f_(j)* faithfully within the boundsgiven by nature.

3.1 Preliminary Discussion and Notations

Applying the compressed sensing strategy to the above imaging task isbase don the following considerations. The conventional approach wouldbe to choose a suitable random matrix B_(ij) of size k×v_(B) withk<<<v_(B), such as for instance suggested for Gaussian noise in [2] orfor Poisson noise in [3]. This is, however, not an option in the givengeometric setup of a measurement where the system matrix is fixed. Theinventors therefore choose a different strategy, which is to take thesystem matrices A_(ij) and B_(ii′) as given and instead randomly selecta small subset of size k of the n potential measurements y_(i). This issimilar to, e.g., [4] but with the important difference that here thereis also noise. Note that this randomized selection is a mathematicaltool which allows to prove the results. In practice, measuring acompletely randomly chosen subset of the available pixels is notnecessarily practical since many pixels are combined on a singledetector and not measuring some of them would be a waste; instead,however, one could measure the object under a small subset of angles(perhaps randomized) and thus achieve a reduction in measurements. Theresult shown below is that on average, the reconstruction f obtainedwould give rise to “ideal” (i.e. noiseless) projections

$\begin{matrix}{{{\overset{\_}{\mu}}_{i} = {{\sum\limits_{i^{\prime}}{B_{{ii}^{\prime}}I_{0}{\exp\left( {- {\sum\limits_{j}{A_{i^{\prime}j}f_{j}}}} \right)}}} + r}}\;} & \left( \; {{\overset{\_}{\mu}}_{i} = {\sum\limits_{i^{\prime}}{B_{{ii}^{\prime}}f_{i^{\prime}}}}} \right)\end{matrix}$

which are very close to the true values μ*_(i) (in a sense to be definedin more detail below) even for those of the n potential measurementswhich were not actually measured but omitted. Thus one obtains a similarresult as if one had measured the full set of n potential measurements.

The inventive approach is a mixture between the methods in [2] and [4].The inventors adapt the methods to deal with noise from [2] to Poissonnoise and borrow the strategy of random measurement subsets from [4].This has the consequence that an important aspect of the usual randommatrix approach is lost. Suitable random matrices are with highprobability incoherent with any compressibility basis, which means thatthe compressibility basis is arbitrary. Here, the situation is different(see below).

The number of projections n is assumed to be of the same order as thenumber of pixels or voxels v_(A/B), i.e. n=O(v_(A/B)), such that onecould expect to get a good reconstruction with ordinary methods (i.e.without using compressed sensing). Only the small random subset iεΩ⊂{1,. . . , n} of the potential projections is actually measured. The set Ωhas size k, and it will be shown in the end that k may be much smallerthan n while still allowing for accurate reconstruction. Thus the actualprojections are written as

$\begin{matrix}{{\mu_{i}^{*} = {\sum\limits_{j}{P_{ij}{\overset{\_}{\mu}}_{j}^{*}}}},{i = 1},\ldots \mspace{14mu},k} & (6)\end{matrix}$

where P_(ij) is a k×n random selection matrix of all 0s, except forexactly one 1 at a uniformly distributed random position in each row,with different rows being independent. For each of these randomprojections one measures a noisy Poisson variable y_(i),i=1, . . . , k.Note that this definition allows, in principle, that multiplemeasurements of the same projections are made. The probability of thishappening is however small when n>>k.

In order to measure the accuracy of the eventual reconstruction thefollowing definition of risk is used. A potential three-dimensionalreconstruction f would produce ideal projections

$\begin{matrix}{{{\overset{\_}{\mu}}_{i} = {{\sum\limits_{i^{\prime}}{B_{i\; i^{\prime}}I_{0}{\exp\left( {- {\sum\limits_{j}{A_{i^{\prime}j}f_{j}}}} \right)}}} + {r\mspace{14mu} {and}\mspace{14mu} {has}\mspace{14mu} {risk}\mspace{14mu} {R\left( \underset{\_}{f} \right)}\mspace{14mu} {with}}}}{{R\left( \underset{\_}{f} \right)} = {\frac{1}{n}{\sum\limits_{i = 1}^{n}{\left( {{\overset{\_}{\mu}}_{i} - {{\overset{\_}{\mu}}_{i}^{*}{\log \left( {\overset{\_}{\mu}}_{i} \right)}}} \right)\mspace{14mu} \text{=:}\mspace{14mu} \frac{1}{n}{\sum\limits_{i = 1}^{n}{{R_{i}\left( {\overset{\_}{\mu}}_{i} \right)}.}}}}}}} & (7)\end{matrix}$

The definition R_(i)( μ _(i)):= μ _(i)− μ*_(i) log ( μ _(i)) will beused below. For the case of planar projections the risk is the same butinstead of the volumetric reconstruction it depends on the planarreconstruction f via

${\underset{\_}{\mu}}_{i} = {{\sum\limits_{i^{\prime}}{B_{{ii}^{\prime}}f_{i}\mspace{14mu} {and}\mspace{14mu} {\overset{\_}{\mu}}_{i}^{*}}} = {\sum\limits_{i^{\prime}}{B_{{ii}^{\prime}}{f_{i}^{B}.}}}}$

For ease of reading, the three-dimensional formulation will be used inthe following unless it makes a difference, in which case both variantswill be spelt out explicitly.

This definition of risk is modeled after the Maximum Likelihoodfunctional derived from Poisson statistics and honours the fact that theuncertainty in estimating a Poisson parameter μ is of order √{squareroot over (μ)} and not a constant. The risk is minimal when μ= μ*, whichis the case when f=f ^(A/B). In addition, the “excess risk” defined by

r( f,f *):=R( f )−R( f *)  (8)

will be useful. It is easy to show that if the background count rater=0, the excess risk is a linear function of the intensity I₀. This alsoholds approximately if I₀ r. Hence the excess risk is a measure for theabsolute error in the projections (doubling the averaged measuredprojection values by doubling the intensity I₀ results in twice theexcess risk). More important, however, is the relative error, which isr(f,f ^(A/B))/I₀. In the end one will obtain an upper bound for r(f,f^(A/B)), and the relative error can be decreased by increasing I₀.

This definition of risk is not immediately useful since it requirescomplete knowledge of f ^(A/B), which needs to be estimated in the firstplace. One therefore needs to estimate the risk from the data available,and this leads to the empirical risk {circumflex over (R)}(f):

$\begin{matrix}{{\hat{R}\left( \underset{\_}{f} \right)} = {\frac{1}{k}{\sum\limits_{i = 1}^{k}\left( {\mu_{i} - {y_{i}{\log \left( \mu_{i} \right)}}} \right)}}} & (9)\end{matrix}$

with

$\mu_{i} = {\sum\limits_{j}{P_{ij}{{\overset{\_}{\mu}}_{j}.}}}$

Again, this is nothing but the Maximum Likelihood functional of Poissonstatistics. The function {circumflex over (R)}(f) is an unbiasedestimator of R(f) since on average

$\begin{matrix}{{E\; {\hat{R}\left( \underset{\_}{f} \right)}} = {\sum\limits_{P}{{{Prob}(P)}\frac{1}{k}{\sum\limits_{i = 1}^{k}\left( {{\sum\limits_{l = 1}^{n}{P_{il}{\overset{\_}{\mu}}_{l}}} - {\sum\limits_{l = 1}^{n}{P_{il}{\overset{\_}{\mu}}_{l}^{*}\log {\sum\limits_{m = 1}^{n}{P_{im}{\overset{\_}{\mu}}_{m}}}}}} \right)}}}} & (10) \\{\mspace{79mu} {= {{\frac{1}{n^{k}}{\sum\limits_{j_{1} = 1}^{n}\mspace{14mu} {\ldots \mspace{14mu} {\sum\limits_{j_{k} = 1}^{n}{\frac{1}{k}{\sum\limits_{i = 1}^{k}\left( {{\overset{\_}{\mu}}_{j_{i}} - {{\overset{\_}{\mu}}_{j_{i}}^{*}{\log \left( {\overset{\_}{\mu}}_{j_{i}} \right)}}} \right)}}}}}} = {{R\left( \underset{\_}{f} \right)}.}}}} & (11)\end{matrix}$

The average is denoted by E . . . and includes both the average over thePoisson statistics of the y_(i), to be done first, and over the randommatrices P. The random matrices can be written as P_(ij)=δ_(j,j) _(i)with j₁, . . . , j_(k) independent random integers drawn with uniformprobability from the set {1, . . . , n}, and the probability of drawinga particular matrix P is thus

${{Prob}(P)} = {\frac{1}{n^{k}}.}$

Inserting this in Eq. (10) leads to Eq. (11).

Analogously to above, the empirical excess risk is defined by

{circumflex over (r)}( f,f ^(A/B))={circumflex over (R)}( f)−{circumflex over (R)}( f ^(A/B)).  (12)

Below the compressibility property of the object f ^(A/B) in somesparsity basis will be needed. Let T⁻¹ be an orthogonal or biorthogonalmatrix the rows of which are filled with the basis vectors of thesparsity basis. Then, taking into account only objects f ^(A/B) forwhich the coefficients in the sparsity basis, θ ^(A/B)=T⁻¹ f ^(A/B),obey

|θ_(k) ^(A/B) |≦S√{square root over (v_(A/B))}k^(−1/q),  (13)

where S>0 and q>0 are constants and the θ_(k) ^(A/B) are ordered bysize, it can easily be concluded that the error of the best m-termapproximation θ ^((m)) to θ ^(A/B) is bounded by

$\begin{matrix}{{\frac{1}{v_{A/B}}{{{\underset{\_}{\theta}}^{(m)} - {\underset{\_}{\theta}}^{A/B}}}^{2}} = {{{\frac{1}{v_{A/B}}{\sum\limits_{k = {m + 1}}^{v_{A/B}}{\theta_{k}^{A/B}}^{2}}} \leq {S^{2}{\int_{m}^{v_{A/B}}{{x}\; x^{{- 2}/q}}}} \leq {\frac{S^{2}}{\frac{2}{q} - 1}m^{1 - {2/q}}}} = {\frac{S^{2}}{2\; \alpha}m^{{- 2}\; \alpha}}}} & (14)\end{matrix}$

with

$\alpha = {\frac{1}{q} - {\frac{1}{2}.}}$

Later, it will also be necessary to estimate the error as measured bythe l₁ norm, and by a similar calculation one obtains

$\begin{matrix}{{{{\underset{\_}{\theta}}^{(m)} - {\underset{\_}{\theta}}^{A/B}}}_{1} = {{\sum\limits_{k = {m + 1}}^{v_{A/B}}{\theta_{k}^{A/B}}} \leq {\frac{S\sqrt{v_{A/B}}}{\alpha - \frac{1}{2}}m^{\frac{1}{2} - \alpha}}}} & (15)\end{matrix}$

In order for this to be useful, it is necessary require that

${\alpha > \frac{1}{2}},$

such that 0<q<1, i.e. the objects considered need to be reasonablystrongly compressible.

For the “real-space” object f ^((m))=Tθ ^((m)) it follows that

$\begin{matrix}{{\frac{1}{v}{{{\underset{\_}{f}}^{(m)} - {\underset{\_}{f}}^{A/B}}}^{2}} = {{\frac{1}{v}{{T\left( {{\underset{\_}{\theta}}^{(m)} - {\underset{\_}{\theta}}^{A/B}} \right)}}^{2}} \leq {\frac{1}{v}{T}^{2}{{{\underset{\_}{\theta}}^{(m)} - {\underset{\_}{\theta}}^{A/B}}}^{2}} \leq {{T}^{2}\frac{S^{2}}{2\; \alpha}m^{{- 2}\; \alpha}}}} & (16)\end{matrix}$

with the l₂ matrix norm ∥·∥. Likewise it can be shown that this implies

$\begin{matrix}{{\frac{1}{v_{A/B}}{{\underset{\_}{f}}^{A/B}}^{2}} = {{\frac{1}{v_{A/B}}{{T\; {\underset{\_}{\theta}}^{A/B}}}^{2}} \leq {{T}^{2}D^{2}}}} & (17)\end{matrix}$

with some positive constant D, i.e. the objects f ^(A/B) come from aball of radius ∥T∥D√{square root over (v_(A/B))}.

3.2 Oracle Inequality

Although the risk R(f) is not directly applicable, it is very usefulsince it allows one to derive an “oracle inequality” which states thatthe average empirical risk ER(f) of a reconstruction (obtained in a wayto be described below) taken from a candidate set is, roughly speaking,within a constant factor of the ideal risk R(f _(min)) of the bestapproximation f _(min) in the candidate set.

This means that even if one had complete knowledge of f ^(A/B) and couldthus find the optimal approximation in the candidate set, this optimalapproximation would be only marginally better than what can be foundwithout this prior knowledge.

Following [2] very closely, it will be shown that for

$\begin{matrix}{{{\underset{\_}{\hat{f}}}_{\min} = {\arg \; {\min\limits_{\underset{\_}{f}}\left( {{\hat{R}\left( \underset{\_}{f} \right)} + \frac{{c\left( \underset{\_}{f} \right)}\log \; 2}{\; {kò}}} \right)}}},} & (18)\end{matrix}$

where c(f) is a penalty term involving the l₀ norm denoted by | . . . |₀of the form

c( f )=const.×log(v)|T ⁻¹ f| ₀,  (19)

the risk is bounded by

$\begin{matrix}{{{Er}\left( {{\underset{\_}{\hat{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} \leq {C_{1}\left( {{r\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} + \frac{{c\left( {\underset{\_}{f}}_{\min} \right)\log \; 2} + 4}{kò}} \right)}} & (20)\end{matrix}$

with a constant C₁ of order 1 and ò a constant to be specified later.

3.2.1. Proof of the Oracle Inequality Consider

$\begin{matrix}{{\hat{r}\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)} = {\frac{1}{k}{\sum\limits_{i = 1}^{k}\left\lbrack {\mu_{i} - {y_{i}{\log \left( \mu_{i} \right)}} - \left( {\mu_{i}^{*} - {y_{i}{\log \left( \mu_{i}^{*} \right)}}} \right)} \right\rbrack}}} & (21) \\{= {:{{- \frac{1}{k}}{\sum\limits_{i = 1}^{k}{u_{i}.}}}}} & (22)\end{matrix}$

Abbreviating μ_(i)−y_(i) log (μ_(i))=ξ_(i) and μ*_(i)−y_(i) log(μ*_(i))=ξ*_(i), one has u_(i)=ξ*_(i)−ξ_(i). The u_(i) are independent,identically distributed random variables. Furthermore,

$\begin{matrix}{{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)} = {\frac{1}{k}{\sum\limits_{i = 1}^{k}\left\lbrack {\underset{\underset{= {E\; \xi_{i}}}{}}{\frac{1}{n}{\sum\limits_{j = 1}^{n}\left( {{\overset{\_}{\mu}}_{j} - {{\overset{\_}{\mu}}_{j}^{*}{\log \left( {\overset{\_}{\mu}}_{j} \right)}}} \right)}} - \underset{\underset{= {E\; \xi_{i}^{*}}}{}}{\left( {\frac{1}{n}{\sum\limits_{j = 1}^{n}\left( {{\overset{\_}{\mu}}_{j}^{*} - {{\overset{\_}{\mu}}_{j}^{*}{\log \left( {\overset{\_}{\mu}}_{j}^{*} \right)}}} \right)}} \right)}} \right\rbrack}}} & (23) \\{\mspace{79mu} {{= {{- \frac{1}{k}}{\sum\limits_{i = 1}^{k}{Eu}_{i}}}},}} & (24)\end{matrix}$

so

${{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)} - {r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}} = {\frac{1}{k}{\sum\limits_{j = 1}^{k}{\left( {u_{j} - {Eu}_{j}} \right).}}}$

Then the Craig-Bernstein inequality [5] can be applied which states thatthe probability of

$\begin{matrix}{{\frac{1}{k}{\sum\limits_{j = 1}^{k}\left( {u_{j} - {Eu}_{j}} \right)}} \geq {\frac{t}{kò} + {{kò}\; \frac{{Var}\left( {\frac{1}{k}{\sum\limits_{j = 1}^{k}u_{j}}} \right)}{2\left( {1 - \zeta} \right)}}}} & (25)\end{matrix}$

is less than or equal to e^(−t) for 0<òh≦ζ<1 and t>0, provided themoment condition

$\begin{matrix}{{E{{u_{j} - {E\; u_{j}}}}^{l}} \leq \frac{{l!}{{Var}\left( u_{j} \right)}h^{l - 2}}{2}} & (26)\end{matrix}$

holds for all l≧2 and some fixed h>0. It will be shown below that onecan find a certain h such that the moment condition holds for u_(j). Itwill also be seen that

${{Var}\left( {\frac{1}{k}{\sum\limits_{j = 1}^{k}u_{j}}} \right)} = {{\frac{1}{k^{2}}{\sum\limits_{j = 1}^{k}{{Var}\left( u_{j} \right)}}} \leq {\frac{C}{k}{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{*}} \right)}}}$

with a constant C. Armed with these two facts, one concludes thatappropriate ò and ζ can be found such that the probability of

$\begin{matrix}{{{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)} - {r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}} \geq {\frac{t}{kò} + {ò\; \frac{C}{2\left( {1 - \zeta} \right)}{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}}}} & (27)\end{matrix}$

is less than e^(−t). One now introduces a finite candidate set F fromwhich to find a good approximation to f ^(A/B). To each f a penalty c(f)is assigned which obeys

${\sum\limits_{\underset{\_}{f} \in F}2^{- {c{(\underset{\_}{f})}}}} \leq 1$

. At this stage the candidate set and the penalty are completelyarbitrary, as long as this inequality holds. Setting δ:=e^(−t) andinserting δ(f):=2^(−c(f))δ in place of e^(−t) in Ineq. (27), theprobability that at least one fεF violates Ineq. (27) is less than

${\sum\limits_{\underset{\_}{f} \in F}{\delta \left( \underset{\_}{f} \right)}} \leq \delta$

(Boole's inequality). With probability 1−δ therefore, the inequality

$\begin{matrix}{{{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)} - {r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}} \leq {\frac{{{c\left( \underset{\_}{f} \right)}\log \; 2} - {\log \; \delta}}{kò} + {ò\; \frac{C}{2\left( {1 - \zeta} \right)}{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}}}} & (28)\end{matrix}$

holds for all fεF.

Let ζ=òh and

$a:=\frac{ò\; C}{2\left( {1 - \zeta} \right)}$

. Then choosing some

$0 < ò < \frac{2}{C + {2h}}$

guarantees that 0<a<1 and ζ<1. Therefore

$\begin{matrix}{{\left( {1 - a} \right){r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}} \leq {{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)} + \frac{{c\left( \underset{\_}{f} \right)\log \; 2} - {\log \; \delta}}{kò}}} & (29)\end{matrix}$

with probability at least 1−δ for all fεF and all 0<δ≦1.

The bound on the risk is minimal for that f which minimizes thepenalized empirical risk:

$\begin{matrix}{{\underset{\_}{\hat{f}}}_{\min} = {{\arg \; {\min\limits_{\underset{\_}{f} \in F}\left( {{\hat{r}\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)} + \frac{{c\left( \underset{\_}{f} \right)}\log \; 2}{kò}} \right)}} = {\arg \; {\min\limits_{\underset{\_}{f} \in F}{\left( {{\hat{R}\left( \underset{\_}{f} \right)} + \frac{{c\left( \underset{\_}{f} \right)}\log \; 2}{kò}} \right).}}}}} & (30)\end{matrix}$

Let f _(min) be the element of F which minimizes the penalized (true)risk, i.e.

$\begin{matrix}{{{\underset{\_}{f}}_{\min} = {\arg \; {\min\limits_{\underset{\_}{f} \in F}\left( {{R\left( \underset{\_}{f} \right)} + \frac{{c\left( \underset{\_}{f} \right)}\log \; 2}{kò}} \right)}}},{then}} & (31) \\{{\left( {1 - a} \right){r\left( {{\underset{\_}{\hat{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)}} \leq {{\hat{r}\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} + \frac{{c\left( {\underset{\_}{f}}_{\min} \right)\log \; 2} - {\log \; \delta}}{kò}}} & (32)\end{matrix}$

(with probability at least 1−δ) since f _(min) minimizes the penalizedempirical risk, and replacing it by f _(min) on the right hand side canonly make the bound bigger.

Repeating the application of the Craig-Bernstein inequality for{circumflex over (r)}(f _(min),f ^(A/B))−r(f _(min),f ^(A/B)) (note thatthe sign is reversed as compared to the above discussion, but theCraig-Bernstein inequality can be employed either way) yields

$\begin{matrix}{{{\hat{r}\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} - {r\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)}} \leq {{{ar}\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} - \frac{\log \; \delta}{kò}}} & (33)\end{matrix}$

with probability at least 1−δ. Using Boole's inequality again toestimate the probability that Ineqs. (32) and (33) are simultaneouslysatisfied yields

$\begin{matrix}{{r\left( {{\underset{\_}{\hat{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} \leq {{\frac{1 + a}{1 - a}{r\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)}} + \frac{{c\left( {\underset{\_}{f}}_{\min} \right)\log \; 2} - {2\; \log \; \delta}}{{kò}\left( {1 - a} \right)}}} & (34)\end{matrix}$

with probability at least 1−2δ.

Since for any random variable X,

E X ≤ ∫₀^(∞)t  Prob(X ≥ t),

one can choose

$X = {{{r\left( {{\underset{\_}{\hat{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} - {\frac{1 + a}{1 - a}{r\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)}} - {\frac{{c\left( {\underset{\_}{f}}_{\min} \right)}\log \; 2}{{kò}\left( {1 - a} \right)}\mspace{14mu} {and}\mspace{14mu} \delta}} = ^{{- {kò}}\; {{t{({1 - a})}}/2}}}$

and integrate Ineq. (34) over t, which finally gives a bound forEr({circumflex over (f)} _(min),f ^(A/B)):

$\begin{matrix}{{{Er}\left( {{\underset{\_}{\hat{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} \leq {{\frac{1 + a}{1 - a}{r\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)}} + \frac{{c\left( {\underset{\_}{f}}_{\min} \right)\log \; 2} + 4}{{kò}\left( {1 - a} \right)}}} & (35) \\{\leq {\frac{1 + a}{1 - a}{\left( {{r\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} + \frac{{c\left( {\underset{\_}{f}}_{\min} \right)\log \; 2} + 4}{kò}} \right).}}} & (36)\end{matrix}$

This is the inequality announced above with

$C_{1} = {\frac{1 + a}{1 - a}.}$

3.2.2. Proof of the Moment Condition

For the proof of the moment conditions one has to deal with twodifficulties: first, the random variables are not simple independentGaussians as in [2] but Poissonians dependent on the values μ_(i) whichare themselves random variables, and second, one can not use Rademacherchaos. Rademacher chaos, by its randomization of signs, can deal withsparsity in any basis (the measurement basis is with high probabilityincoherent with any other basis). Here, it will be seen that thesparsity basis T must be incoherent with the system matrix A. This is inline with the results from [4].

One needs to show that the moment condition (26) holds for some h>0 andall l≧2. According to [2] it is sufficient to find a suitable constanth′ for all even l as this implies that the moment condition holds forall l with h=2h′ (Lemma 1 of [2]). The random variable u_(j)−Eu_(j) canbe split into two parts, u_(j)−Eu_(j)=X₁+X₂, with

X ₁:=−(μ_(j) −Eμ _(j))+μ*_(j) log(μ_(j))−Eμ* _(j) log(μ_(j))+μ*_(j) −Eμ*_(j)−(μ*_(j) log(μ*_(j))−Eμ* _(j) log(μ*_(j)))  (37)

X ₂:=(y _(j)−μ*_(j))log(μ_(j))−(y _(j)−μ*_(j))log(μ*_(j)),  (38)

each of which is a random variable with zero mean. Furthermore, EX₁X₂=0.One can therefore apply Lemma 2 of [2] and conclude that X₁+X₂ satisfiesthe moment condition (26) for all even l≧4 with constant h′=√{squareroot over (2)}(h₁+h₂) if X₁ and X₂ satisfy them separately withconstants h₁ and h₂.

Moment Condition for X₁ and X₂

A bounded random variable X with zero mean and |X|≦X_(max) satisfies themoment condition trivially with constant h_(X)=X_(max)/3. If it isobserved that

$\begin{matrix}{{{\overset{\_}{\mu}}_{i}^{*} = {{{I_{0}{\exp\left( {- {\sum\limits_{j}{A_{ij}f_{j}^{A/B}}}} \right)}} + r} \leq A_{\max}}},} & (39)\end{matrix}$

where A_(max) is the maximum dose per pixel (plus background count rate)of the CT (or X-ray or other device at hand), then μ*_(i) is upperbounded by the constant A_(max). This follows since both f_(i) ^(A/B)and A_(ij) are nonnegative. Without loss of generality we haveA_(max)>1. Note that A_(max) is a machine constant and does not dependon v_(A/B).

Since μ*_(j) is also bounded from below by r, one may write (increasingA_(max) if necessary)

$\begin{matrix}{{\overset{\_}{\mu}}_{j}^{*} \geq {\frac{1}{A_{\max}}.}} & (40)\end{matrix}$

By the same argument the same bounds hold for the candidate projectionsμ_(i). It then follows that |X₁|≦const.×A_(max) log (A_(max)), and thusthe moment condition holds.

It remains to show that the moment condition also holds for X₂. Consider

$\begin{matrix}{{E\; X_{2}^{2m}} = {\frac{1}{n}{\sum\limits_{i = 1}^{n}{{M_{2m}\left( {\overset{\_}{\mu}}_{i}^{*} \right)}{\log^{2\; m}\left( \frac{{\overset{\_}{\mu}}_{i}}{{\overset{\_}{\mu}}_{i}^{*}} \right)}}}}} & (41) \\{= {\frac{1}{n}{\sum\limits_{i = 1}^{n}{{M_{2}\left( {\overset{\_}{\mu}}_{i}^{*} \right)}{\log^{2}\left( \frac{{\overset{\_}{\mu}}_{i}}{{\overset{\_}{\mu}}_{i}^{*}} \right)}\frac{M_{2m}\left( {\overset{\_}{\mu}}_{i}^{*} \right)}{M_{2}\left( {\overset{\_}{\mu}}_{i}^{*} \right)}{{\log^{{2m} - 2}\left( \frac{{\overset{\_}{\mu}}_{i}}{{\overset{\_}{\mu}}_{i}^{*}} \right)}.}}}}} & (42)\end{matrix}$

Here M_(2m)(μ) is the 2m-th central moment of a Poisson variable withparameter μ. The second logarithm on the right hand side can beestimated by

$\begin{matrix}{{\log^{{2m} - 2}\left( \frac{{\overset{\_}{\mu}}_{i}}{{\overset{\_}{\mu}}_{i}^{*}} \right)} \leq {{\log^{{2m} - 2}\left( A_{\max}^{2} \right)}.}} & (43)\end{matrix}$

It is shown in the Appendix that the central moments obeyM_(2m)(μ)≦(2m)!(max (1,μ))^(m), such that

$\begin{matrix}{\frac{M_{2m}\left( {\overset{\_}{\mu}}_{i}^{*} \right)}{M_{2}\left( {\overset{\_}{\mu}}_{i}^{*} \right)} \leq {{\left( {2m} \right)!}\frac{\left( {\max \left( {1,{\overset{\_}{\mu}}_{i}^{*}} \right)} \right)^{m}}{{\overset{\_}{\mu}}_{i}^{*}}} \leq {{\left( {2m} \right)!}A_{\max}^{\frac{{2m} - 2}{2}}\frac{\max \left( {1,{\overset{\_}{\mu}}_{i}^{*}} \right)}{{\overset{\_}{\mu}}_{i}^{*}}}} & (44) \\{\leq {{\left( {2m} \right)!}{\sqrt{A_{\max}}}^{{2m} - 2}A_{\max}} \leq {\frac{\left( {2m} \right)!}{2}{\left( {\sqrt{2}A_{\max}} \right)^{{2m} - 2}.}}} & (45)\end{matrix}$

The last step is valid for all m≧2. Together one obtains the bound

$\begin{matrix}{{E\; X_{2}^{2m}} \leq {\frac{\left( {2m} \right)!}{2}\frac{1}{n}{\sum\limits_{i = 1}^{n}{{M_{2}\left( {\overset{\_}{\mu}}_{i}^{*} \right)}{\log^{2}\left( \frac{{\overset{\_}{\mu}}_{i}}{{\overset{\_}{\mu}}_{i}^{*}} \right)}\left( {2\sqrt{2}A_{\max}{\log \left( A_{\max} \right)}} \right)^{{2m} - 2}}}}} & (46) \\{\mspace{79mu} {{= {\frac{\left( {2m} \right)!}{2}{{Var}\left( X_{2} \right)}\left( {2\sqrt{2}A_{\max}{\log \left( A_{\max} \right)}} \right)^{{2m} - 2}}},}} & (47)\end{matrix}$

which is the moment condition for X₂.

Taken together, it is now established that X₁+X₂ obey the momentcondition with a parameter h with

h=const.×A _(max) log(A _(max)).  (48)

2. The Upper Bound for Var(u_(j))

The last task is to find an upper bound for Var(u_(j)), whereu_(j)=ξ*_(j)−ξ_(j) and ξ*_(j)=μ*_(j)−y_(j) log (μ*_(j)),ξ_(j)=μ_(j)−y_(j) log (μ_(j)). It is easy to show that the averagevalues Eu_(j) and Eu_(j) ² are given by

$\begin{matrix}{{Eu}_{j} = {{- \frac{1}{n}}{\sum\limits_{l = 1}^{n}\left( {{R_{l}\left( \underset{\_}{f} \right)} - {R_{l}\left( {\underset{\_}{f}}^{A/B} \right)}} \right)}}} & (49)\end{matrix}$

$\begin{matrix}{{Eu}_{j}^{2} = {\frac{1}{n}{\sum\limits_{l = 1}^{n}{\left\lbrack {\left( {{R_{l}\left( \underset{\_}{f} \right)} - {R_{l}\left( {\underset{\_}{f}}^{A/B} \right)}} \right)^{2} + {{\overset{\_}{\mu}}_{l}^{*}\left( {{\log \left( {\overset{\_}{\mu}}_{l} \right)} - {\log \left( {\overset{\_}{\mu}}_{l}^{*} \right)}} \right)}^{2}} \right\rbrack.}}}} & (50)\end{matrix}$

For the definition of R_(l)(f) see Sec. 2. The variance is then

$\begin{matrix}{{{Var}\left( u_{j} \right)} = {{\frac{1}{n^{2}}{\sum\limits_{l,{l^{\prime} = 1}}^{n}{\left( {{R_{l}\left( \underset{\_}{f} \right)} - {R_{l}\left( {\underset{\_}{f}}^{A/B} \right)}} \right)\left( {{R_{l^{\prime}}\left( \underset{\_}{f} \right)} - {R_{l^{\prime}}\left( {\underset{\_}{f}}^{A/B} \right)}} \right)\left( {{n\; \delta_{{ll}^{\prime}}} - 1} \right)}}} + {\frac{1}{n}{\sum\limits_{l = 1}^{n}{{\overset{\_}{\mu}}_{l}^{*}\left( {{\log \left( {\overset{\_}{\mu}}_{l} \right)} - {\log \left( {\overset{\_}{\mu}}_{l}^{*} \right)}} \right)}^{2}}}}} & (51) \\{\leq {{\frac{n - 1}{n^{2}}{\sum\limits_{l = 1}^{n}\left( {{R_{l}\left( \underset{\_}{f} \right)} - {R_{l}\left( {\underset{\_}{f}}^{A/B} \right)}} \right)^{2}}} + {\frac{1}{n}{\sum\limits_{l = 1}^{n}{{{\overset{\_}{\mu}}_{l}^{*}\left( {{\log \left( {\overset{\_}{\mu}}_{l} \right)} - {\log \left( {\overset{\_}{\mu}}_{l}^{*} \right)}} \right)}^{2}.}}}}} & (52)\end{matrix}$

The inequality follows because each term R_(l′)(f)−R_(l′)(f ^(A/B)) isnonnegative and one may thus omit some of the terms with a negative signin the first sum by replacing nδ_(ll′)−1 by nδ_(ll′)−δ_(ll′). Since

${\frac{1}{A_{\max}} \leq {\overset{\_}{\mu}}_{l}},{{\overset{\_}{\mu}}_{l}^{*} \leq A_{\max}},$

it follows that

$\begin{matrix}{{{{R_{l}\left( \underset{\_}{f} \right)} - {R_{l}\left( {\underset{\_}{f}}^{A/B} \right)}} \leq {A_{\max} - \frac{1}{A_{\max}} + {2\; A_{\max}{\log \left( A_{\max} \right)}}}},} & (53)\end{matrix}$

such that

$\begin{matrix}{{\frac{n - 1}{n^{2}}{\sum\limits_{l = 1}^{n}\left( {{R_{l}\left( \underset{\_}{f} \right)} - {R_{l}\left( {\underset{\_}{f}}^{A/B} \right)}} \right)^{2}}} \leq {\left( {A_{\max} - \frac{1}{A_{\max}} + {2\; A_{\max}{\log \left( A_{\max} \right)}}} \right)\frac{1}{n}{\sum\limits_{l = 1}^{n}\left( {{R_{l}\left( \underset{\_}{f} \right)} - {R_{l}\left( {\underset{\_}{f}}^{A/B} \right)}} \right)}}} & (54) \\{\mspace{79mu} {= {\left( {A_{\max} - \frac{1}{A_{\max}} + {2\; A_{\max}{\log \left( A_{\max} \right)}}} \right){{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}.}}}} & (55)\end{matrix}$

It remains to bound the term

$\frac{1}{n}{\sum\limits_{l = 1}^{n}{{{\overset{\_}{\mu}}_{l}^{*}\left( {{\log \left( {\overset{\_}{\mu}}_{l} \right)} - {\log \left( {\overset{\_}{\mu}}_{l}^{*} \right)}} \right)}^{2}.}}$

In order to do so, consider the functions h_(y)(x)=y(log x−log y)² andg_(y)(x)=x−y log x−(y−y log y). They both attain their global minimum 0at x=y and have zero derivative there. Their second derivatives are

${{h_{y}^{''}(x)} = {{\frac{2\; y}{x^{2}}\left( {1 + {\log \; y} - {\log \; x}} \right)\mspace{14mu} {and}\mspace{14mu} {g_{y}^{''}(x)}} = \frac{y}{x^{2}}}},$

which implies

h_(y)^(″)(x) ≤ 2(1 + log  y − log  x)g_(y)^(′)(x) ≤ 2(1 + 2 log (A_(max)))g_(y)^(′)(x)  for${\frac{1}{A_{\max}} \leq x},{y \leq {A_{\max}.}}$

Therefore h_(y)(x)≦2(1+2 log (A_(max)))g_(y)(x). Consequently,

$\begin{matrix}{{\frac{1}{n}{\sum\limits_{l = 1}^{n}{{\overset{\_}{\mu}}_{l}^{*}\left( {{\log \left( {\overset{\_}{\mu}}_{l} \right)} - {\log \left( {\overset{\_}{\mu}}_{l}^{*} \right)}} \right)}^{2}}} \leq {2\left( {1 + {2\; {\log \left( A_{\max} \right)}}} \right)\frac{1}{n}{\sum\limits_{l = 1}^{n}\left( {{R_{l}\left( \underset{\_}{f} \right)} - {R_{l}\left( {\underset{\_}{f}}^{A/B} \right)}} \right)}}} & (56) \\{\mspace{79mu} {= {2\left( {1 + {2\; {\log \left( A_{\max} \right)}}} \right){{r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}.}}}} & (57)\end{matrix}$

Combined, one obtains

$\begin{matrix}{{{{Var}\left( u_{j} \right)} \leq {C\; {r\left( {\underset{\_}{f},{\underset{\_}{f}}^{A/B}} \right)}}}{with}} & (58) \\{C = {{\left( {A_{\max} + 2} \right)\left( {{2{\log \left( A_{\max} \right)}} + 1} \right)} - {\frac{1}{A_{\max}}.}}} & (59)\end{matrix}$

3.3 Application of the Oracle Inequality to Compressible Objects

A compressible object f ^(A/B) is one which can be representedaccurately by a small number of coefficients in some basis. To beprecise, if

$f_{i}^{A/B} = {\sum\limits_{j}{T_{ij}\theta_{j}^{A/B}}}$

where T_(ij) is an biorthogonal matrix and θ_(j) ^(A/B) are thecoefficients, and if f ^((m)) is the best m-term approximation to f^(A/B) in this basis, then as shown above the error

$\begin{matrix}{{\frac{1}{v_{A/B}}{{{\underset{\_}{f}}^{(m)} - {\underset{\_}{f}}^{A/B}}}^{2}} \leq {{T}^{2}\frac{S}{2\; \alpha}m^{{- 2}\; \alpha}}} & (60)\end{matrix}$

decays at least as fast as m^(−2α) if f ^(A/B) is compressible. Suchcompressible objects also lie within a ball of radius ∥T∥D√{square rootover (v_(A/B))} with some constant D>0.

From the oracle inequality it is known that

${{{Er}\left( {{\underset{\_}{\hat{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} \leq {C_{1}\left( {{r\left( {{\underset{\_}{f}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} + \frac{{c\left( {\underset{\_}{f}}_{\min} \right)\log \; 2} + 4}{k\; ò}} \right)}},$

where f _(min) is that vector from the candidate set F which minimizesthe risk. This can be reformulated in terms of the coefficient vectors θ^(A/B) of f ^(A/B), θ of f and θ _(min) of f _(min):

$\begin{matrix}{{{Er}\left( {{\underset{\_}{\hat{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} \leq {{C_{1}\left( {{r\left( {{T\; {\underset{\_}{\theta}}_{\min}},{T\; {\underset{\_}{\theta}}^{A/B}}} \right)} + \frac{{{c\left( {T\; {\underset{\_}{\theta}}_{\min}} \right)}\log \; 2} + 4}{k\; ò}} \right)}.}} & (61)\end{matrix}$

The candidate set F is chosen to consist of all f which satisfy thebounds f_(i)≧0 and |f∥≦∥T∥D√{square root over (v_(A/B))}, arecompressible and which have a representation f=Tθ where the coefficientsθ_(i) are uniformly quantized to v_(p) levels. Let T be the set ofcoefficient vectors θ corresponding to the set F. The penalty is chosento be

c(T θ)=(1+p)log(v _(A/B))|θ∥₀,  (62)

i.e. equal to the l₀-norm (the number of non-zero elements) of θ, up toa factor. This penalty satisfies

${{\sum\limits_{\underset{\_}{\theta} \in T}2^{- {c{(\underset{\_}{\theta})}}}} \leq 1},$

which is called Kraft inequality in this context. It will also bedenoted by c(θ) (instead of c(Tθ)) for brevity.

Let θ ^((m)) be the coefficient vector corresponding to the best m-termapproximation to f ^(A/B), i.e. f ^((m))=Tθ ^((m)), and let θ _(q)^((m))εT be the closest element to θ ^((m)) in T. One can now replace θ_(min) in the oracle inequality by θ _(q) ^((m)) since

${{r\left( {{T\; {\underset{\_}{\theta}}_{\min}},{T\; {\underset{\_}{\theta}}^{A/B}}} \right)} + \frac{{c\left( {\underset{\_}{\theta}}_{\min} \right)\log \; 2} + 4}{k\; ò}} \leq {{r\left( {{T\; {\underset{\_}{\theta}}_{q}^{(m)}},{T\; {\underset{\_}{\theta}}^{A/B}}} \right)} + \frac{{c\left( {\underset{\_}{\theta}}_{q}^{(m)} \right)\log \; 2} + 4}{k\; ò}}$

by definition of θ _(min).

Next one can estimate r(Tθ _(q) ^((m)),Tθ ^(A/B)) by noting that

$\begin{matrix}{{r\left( {{T\; {\underset{\_}{\theta}}_{q}^{(m)}},{T\; {\underset{\_}{\theta}}^{A/B}}} \right)} = {{\frac{1}{n}{\sum\limits_{k = 1}^{n}{{\overset{\_}{\mu}}_{k}^{*}\left( {\frac{{\overset{\_}{\mu}}_{k}}{{\overset{\_}{\mu}}_{k}^{*}} - 1 - {\log \; \frac{{\overset{\_}{\mu}}_{k}}{{\overset{\_}{\mu}}_{k}^{*}}}} \right)}}} \leq {\frac{1}{n}{\sum\limits_{k = 1}^{n}{{\max \left( {1,\frac{{\overset{\_}{\mu}}_{k}^{*}}{{\overset{\_}{\mu}}_{k}}} \right)}{{{{\overset{\_}{\mu}}_{k}^{*} - {\overset{\_}{\mu}}_{k}}}.}}}}}} & (63)\end{matrix}$

This follows from the easily provable property

${x - 1 - {\log \; x}} \leq {{\max \left( {x,\frac{1}{x}} \right)} - 1}$

for all x>0. One must now distinguish between three-dimensional andplanar imaging. For three-dimensional reconstruction one has

$\begin{matrix}{{{\overset{\_}{\mu}}_{k}^{*} = {{I_{0}{\sum\limits_{i^{\prime}}{B_{{ki}^{\prime}}{\exp\left( {- {\sum\limits_{jl}{A_{i^{\prime}j}T_{jl}\theta_{l}^{A}}}} \right)}}}} + r}},} & (64)\end{matrix}$

while for planar imaging

$\begin{matrix}{{\overset{\_}{\mu}}_{k}^{*} = {{\sum\limits_{i^{\prime}l}{B_{{ki}^{\prime}}T_{i^{\prime}l}\theta_{l}^{B}}} + {r.}}} & (65)\end{matrix}$

Analogous expressions hold for μ _(k) by replacing θ_(l) ^(A/B) byθ_(q,l) ^((m)).

In the former case one has for μ*_(k)≧ μ _(k)

$\begin{matrix}{{0 \leq {{\overset{\_}{\mu}}_{k}^{*} - {\overset{\_}{\mu}}_{k}}} = {I_{0}{\sum\limits_{i^{\prime}}{B_{{ki}^{\prime}}{\exp\left( {- {\sum\limits_{j\; l}{A_{i^{\prime}j}T_{jl}\theta_{l}^{A}}}} \right)}\left( {1 - {\exp\left( {- {\sum\limits_{jl}{A_{i^{\prime}l}{T_{jl}\left( {\theta_{q,l}^{(m)} - \theta_{l}^{A}} \right)}}}} \right)}} \right)}}}} & (66) \\{\mspace{79mu} {\leq {I_{0}{\sum\limits_{i^{\prime}}{B_{{ki}^{\prime}}{\exp\left( {- {\sum\limits_{jl}{A_{i^{\prime}j}T_{jl}\theta_{l}^{A}}}} \right)}{{\sum\limits_{jl}{A_{i^{\prime}j}{T_{jl}\left( {\theta_{q,l}^{(m)} - \theta_{l}^{A}} \right)}}}}}}}}} & (67)\end{matrix}$

and for μ*_(k)< μ _(k)

$\begin{matrix}{{0 \leq {{\overset{\_}{\mu}}_{k}^{*} - {\overset{\_}{\mu}}_{k}}} = {I_{0}{\sum\limits_{i^{\prime}}{B_{{ki}^{\prime}}{\exp\left( {- {\sum\limits_{j\; l}{A_{i^{\prime}j}T_{jl}\theta_{q,l}^{(m)}}}} \right)}\left( {1 - {\exp\left( {- {\sum\limits_{jl}{A_{i^{\prime}l}{T_{jl}\left( {\theta_{l}^{A} - \theta_{q,l}^{(m)}} \right)}}}} \right)}} \right)}}}} & (68) \\{\mspace{79mu} {{\leq {I_{0}{\sum\limits_{i^{\prime}}{B_{{ki}^{\prime}}{\exp\left( {- {\sum\limits_{jl}{A_{i^{\prime}j}T_{jl}\theta_{q,l}^{(m)}}}} \right)}{{\sum\limits_{jl}{A_{i^{\prime}j}{T_{jl}\left( {\theta_{l}^{A} - \theta_{q,l}^{(m)}} \right)}}}}}}}},}} & (69)\end{matrix}$

both of which follows from the relation 1−exp (−x)≦x for all x. Notethat the part within the modulus bars will be estimated by a constantbelow and the remaining parts in Eqs. (67) and (69) are simply equal toμ*_(k)−r and μ _(k)−r, respectively.

In the latter case, on the other hand, one has

$\begin{matrix}{{{{\overset{\_}{\mu}}_{k}^{*} - {\overset{\_}{\mu}}_{k}}} = {{{\sum\limits_{i^{\prime}l}{B_{{ki}^{\prime}}{T_{i^{\prime}l}\left( {\theta_{l}^{B} - \theta_{q,l}^{(m)}} \right)}}}}.}} & (70)\end{matrix}$

In both cases, one needs to bound an expression of the form

$\begin{matrix}{{{{\sum\limits_{jl}{A_{i^{\prime}j}{T_{jl}\left( {\theta_{l}^{A} - \theta_{q,l}^{(m)}} \right)}}}} \leq {\sum\limits_{l}{{{\sum\limits_{j}{A_{i^{\prime}j}T_{jl}}}}{{\theta_{l}^{A} - \theta_{q,l}^{(m)}}}}} \leq {\max\limits_{rs}{{({AT})_{rs}}{\sum\limits_{l}{{\theta_{l}^{A} - \theta_{q,l}^{(m)}}}}}}},} & (71)\end{matrix}$

with the matrix B in place of A and θ_(l) ^(B) instead of θ_(l) ^(A) forthe planar case. At this stage it becomes clear that the incoherence

$\begin{matrix}{J_{A} = {\max\limits_{rs}{{({AT})_{rs}}\mspace{20mu} {{resp}.}}}} & (72) \\{J_{B} = {\max\limits_{rs}{({BT})_{rs}}}} & (73)\end{matrix}$

between the system matrix A (resp. B) and the sparsity basis T entersthe problem. Bounding the remaining term

$\sum\limits_{l}{{\theta_{l}^{A/B} - \theta_{q,l}^{(m)}}}$

is easy, due to the compressibility property and the fact that θ ^((m))is the best m-term approximation to θ*. The bound is given by

$\begin{matrix}{{\sum\limits_{l}{{\theta_{l}^{A/B} - \theta_{q,l}^{(m)}}}} \leq {\sum\limits_{l}\left( {{{\theta_{l}^{A/B} - \theta_{l}^{(m)}}} + {{\theta_{l}^{(m)} - \theta_{q,l}^{(m)}}}} \right)}} & (74)\end{matrix}$

$\begin{matrix}{\leq {{\frac{\sqrt{v_{A/B}}S}{\alpha - \frac{1}{2}}m^{\frac{1}{2} - \alpha}} + {\frac{2\; D}{v_{A/B}^{p - {3/2}}}.}}} & (75)\end{matrix}$

Since it is exponentially small, the second term on the right hand sideis irrelevant and will be omitted in the following.

With these results, it is finally possible to bound the excess risk by

$\begin{matrix}{{r\left( {{T\; {\underset{\_}{\theta}}_{q}^{(m)}},{T\; {\underset{\_}{\theta}}^{A}}} \right)} \leq {\frac{1}{n}{\sum\limits_{k = 1}^{n}{{\max \left( {1,\frac{{\overset{\_}{\mu}}_{k}^{*}}{{\overset{\_}{\mu}}_{k}}} \right)}{\max \left( {{{\overset{\_}{\mu}}_{k} - r},{{\overset{\_}{\mu}}_{k}^{*} - r}} \right)}\frac{J_{A}\sqrt{v_{A}}S}{\alpha - \frac{1}{2}}m^{\frac{1}{2} - \alpha}}}}} & (76)\end{matrix}$

for three-dimensional imaging and

$\begin{matrix}{{r\left( {{T\; {\underset{\_}{\theta}}_{q}^{(m)}},{T\; {\underset{\_}{\theta}}^{B}}} \right)} \leq {\frac{1}{n}{\sum\limits_{k = 1}^{n}{{\max \left( {1,\frac{{\overset{\_}{\mu}}_{k}^{*}}{{\overset{\_}{\mu}}_{k}}} \right)}\frac{J_{B}\sqrt{v_{B}}S}{\alpha - \frac{1}{2}}m^{\frac{1}{2} - \alpha}}}}} & (77)\end{matrix}$

for the planar case. This can be further bounded by

$\begin{matrix}{{r\left( {{T\; {\underset{\_}{\theta}}_{q}^{(m)}},{T\; {\underset{\_}{\theta}}^{A/B}}} \right)} \leq {A_{\max}^{\beta_{A/B}}\frac{J_{A/B}\sqrt{v_{A/B}}S}{\alpha - \frac{1}{2}}m^{\frac{1}{2} - \alpha}}} & (78)\end{matrix}$

with β_(A):=3 and β_(B):=2. Combined,

$\begin{matrix}{{{Er}\left( {{\hat{\underset{\_}{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} \leq {{C_{1}\begin{pmatrix}{{A_{\max}^{\beta_{A/B}}\frac{J_{A/B}\sqrt{v_{A/B}}S}{\alpha - \frac{1}{2}}m^{\frac{1}{2} - \alpha}} +} \\\frac{{{c\left( {\underset{\_}{\theta}}^{(m)} \right)}\log \; 2} + 4}{k\overset{\backprime}{o}}\end{pmatrix}}.}} & (79)\end{matrix}$

These error bounds are minimal when

$\begin{matrix}{m = {\left( \frac{\left( {1 + p} \right)\log \; v_{A/B}\log \; 2}{k\overset{\backprime}{o}A_{\max}^{\beta_{A/B}}J_{A/B}\sqrt{v_{A/B}}S} \right)^{- \frac{1}{\alpha + \frac{1}{2}}}.}} & (80)\end{matrix}$

Inserting this in Eq. (79) one obtains

$\begin{matrix}{{{{Er}\left( {{\hat{\underset{\_}{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} \leq {C_{1}{C_{2}^{A/B}\left( \frac{\left( {J_{A/B}\sqrt{v_{A/B}}} \right)^{\frac{1}{\alpha - \frac{1}{2}}}\log \mspace{14mu} v_{A/B}}{k} \right)}^{\frac{{2\alpha} - 1}{{2\alpha} + 1}}}}{with}} & (81) \\{C_{2}^{A/B} = {{\frac{{2\alpha} + 1}{{2\alpha} - 1}\left( {A_{\max}^{\beta_{A/B}}S} \right)^{\frac{1}{\alpha + \frac{1}{2}}}\left( \frac{\left( {1 + p} \right)\log \; 2}{\overset{\backprime}{o}} \right)^{\frac{{2\alpha} - 1}{{2\alpha} + 1}}} + {\frac{4}{\overset{\backprime}{o}}.}}} & (82)\end{matrix}$

Inequality (81) with the factors C₂ ^(A/B) from Eq. (82) follows if onesassumes that

${\left( {J_{A/B}\sqrt{v_{A/B}}} \right)^{\frac{1}{\alpha - \frac{1}{2}}}\log \mspace{14mu} v_{A/B}} \geq 1.$

Recall that one may choose any

${0 < \overset{\backprime}{o} < \frac{2}{C + {2h}}} = {{O\left( \frac{1}{A_{\max}\log \mspace{14mu} A_{\max}} \right)}.}$

. In order to keep the constant C₁ close to 1, ò should not be chosentoo close to the upper bound. For instance, the choice

$\begin{matrix}{\overset{\backprime}{o} = {\frac{1}{4}\frac{2}{C + {2h}}}} & (83)\end{matrix}$

is sensible since it implies

$1 < C_{1} < {\frac{5}{3}.}$

When the object f ^(A/B) is truly sparse, i.e. only m coefficients θ_(i)^(A/B) are nonzero, the term |θ ^((m))−θ ^(A/B)|=0, so the only relevantterm is the penalty term. One then obtains the result

$\begin{matrix}{{{Er}\left( {{\hat{\underset{\_}{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)} \leq {{C_{1}\left( {{\left( {1 + p} \right)\log \; 2} + 4} \right)}\frac{1}{\overset{\backprime}{o}}{\frac{m\; \log \; v_{A/B}}{k}.}}} & (84)\end{matrix}$

3.4 Summary

The important consequence of Eq. (81) can be summarized for the relativeerror of the reconstruction {circumflex over (f)} _(min),

${{Er}\frac{\left( {{\hat{\underset{\_}{f}}}_{\min},{\underset{\_}{f}}^{A/B}} \right)}{I_{0}}} \leq {{{const}.} \times \frac{1}{I_{0}}{\left( \frac{\left( {J_{A/B}\sqrt{v_{A/B}}} \right)^{\frac{1}{\alpha - \frac{1}{2}}}\log \; v_{A/B}}{k} \right)^{\frac{{2\alpha} - 1}{{2\alpha} + 1}}.}}$

This means that one can either set an acceptable limit for the relativeerror and adjust the dose (i.e., I₀) and the number of pixels measured,k, to get below this limit, or one can choose I₀ and k first andcalculate the reliability afterwards. The constant appearing in thisequation depends only on A_(max), the maximal dose the device is capableof, and the compressibility properties of the object, namely theexponent α. The optimal situation is when J_(A/B), the incoherencebetween the system matrix and the sparsity basis T, is minimal, i.e. oforder v_(A/B) ^(−1/2). However, this cannot be guaranteed since thesystem matrix is fixed and the sparsity basis can only be varied to acertain extent (such that the object remains compressible in thisbasis). But as long as J_(A/B)˜v_(A/B) ^(−γ) with some exponent γ>1−α,the expression in brackets can still be small even if k <<v_(A/B).

3.5 Central Moments of the Poisson Distribution

According to [6] and references therein, the central moments of thePoisson distribution M_(n)(μ) obey the following recursion relation:

$\begin{matrix}{{M_{n}(\mu)} = {{{\mu \left( {n - 1} \right)}!}{\sum\limits_{i = 0}^{n - 2}{\frac{M_{i}(\mu)}{{i!}{\left( {n - 1 - i} \right)!}}.}}}} & (85)\end{matrix}$

In order to show that for all n and μ≧0, M_(n)(μ)≦n!(max (1,μ))^(n/2),one can start by showing that this is the case for n=0, 1, 2:

M ₀(μ)=1≦(max(1,μ))⁰=1  (86)

M ₁(μ)=0≦(max(1,μ))^(1/2)  (87)

M ₂(μ)=μ≦2(max(1,μ))¹.  (88)

Proceeding by induction and assuming that M_(i)(μ)≦i!(max (1,μ))^(1/2)holds for all 0≦i<n yields

$\begin{matrix}{{M_{n}(\mu)} = {{{\mu \left( {n - 1} \right)}!}{\sum\limits_{i = 0}^{n - 2}\frac{M_{i}(\mu)}{{i!}{\left( {n - 1 - i} \right)!}}}}} & (89) \\{\leq {{\mu \left( {n - 1} \right)}{\sum\limits_{i = 0}^{n - 2}{\frac{\left( {n - 2} \right)!}{{i!}{\left( {n - 1 - i} \right)!}}{i!}\left( {\max \left( {1,\mu} \right)} \right)^{i/2}}}}} & (90) \\{\leq {{\mu \left( {n - 1} \right)}{\sum\limits_{i = 0}^{n - 2}{\frac{\left( {n - 2} \right)!}{{i!}{\left( {n - 1 - i} \right)!}}\left( {\max \left( {1,\mu} \right)} \right)^{{({n - 2})}/2}}}}} & (91) \\{\leq {{\mu \left( {n - 1} \right)}^{2}{\left( {n - 2} \right)!}\left( {\max \left( {1,\mu} \right)} \right)^{{({n - 2})}/2}}} & (92) \\{{\leq {{n!}\left( {\max \left( {1,\mu} \right)} \right)^{n/2}}},} & (93)\end{matrix}$

which concludes the proof.

4. Experimental Results

FIG. 4 shows exemplary images illustrating advantages obtained with theinvention. FIG. 4A shows an original X-ray measurement of a thoraxphantom with a low dose at 80 kV and 8 mAs source current resulting in ahigh noise level. FIG. 4C shows a corresponding original X-raymeasurement of the thorax phantom with a high dose at 80 kV and 25.6 mAsresulting in a low noise level. If according to FIG. 4B an imagereconstruction is applied to the original high noise image of FIG. 4Ausing the second embodiment of the invention and focussing mainly on theremoval of noise, the quality of the result is comparable to the highdose in FIG. 4C. Furthermore, with a reconstruction of FIG. 4C using theinvention, focussing mainly on the elimination of the detector responsefunction, the level of detail is greatly enhanced without raising thenoise level.

The features of the invention disclosed in the above description, thedrawings and the claims can be of significance both individually as wellas in combination for the realization of the invention in its variousembodiments.

1. An image reconstruction method for reconstructing an image f ^(min)representing a region of investigation within an object, comprising thesteps of: providing detector data y_(i) comprising Poisson random valuesfrom an X-ray transmission measurement using an X-ray source and adetector device, said detector data y_(i) being measured at an i-th of aplurality of different pixel positions of the detector device relativeto the object, and reconstructing the image f ^(min) based on thedetector data y_(i), said reconstructing step including a procedure ofminimizing a functional F(f)${F\left( \underset{\_}{f} \right)} = {{\frac{1}{k}{\sum\limits_{i = 1}^{k}\left( {\mu_{i} - {y_{i}\log \; \mu_{i}}} \right)}} + {a{{T^{- 1}\underset{\_}{f}}}_{p}}}$wherein f is a current test image used for minimizing the functionalF(f),$\frac{1}{k}{\sum\limits_{i = 1}^{k}\left( {\mu_{i} - {y_{i}\log \; \mu_{i}}} \right)}$is a maximum-likelihood risk functional for Poisson statistics, saidparameters μ_(i) being transmission projections of the test image f,said projections being computed according to Beer-Lambert's law at thei-th pixel position relative to the X-ray source, |T⁻¹ f|_(p) is asparsity enforcing functional including the l_(p) norm of vector T⁻¹ fwith 0≦p<2, said vector T⁻¹ f being a sparse or compressiverepresentation of f in a (bi-) orthogonal basis T, and a is acalibration factor, wherein the image f ^(min) represents the globalminimum of the functional F(f).
 2. The image reconstruction methodaccording to claim 1, wherein the functional F(f) additionally includesan additive regularization function R(f) suppressing artefacts:${F\left( \underset{\_}{f} \right)} = {{\frac{1}{k}{\sum\limits_{i = 1}^{k}\left( {\mu_{i} - {y_{i}\log \; \mu_{i}}} \right)}} + {a{{T^{- 1}\underset{\_}{f}}}_{p}} + {{R\left( \underset{\_}{f} \right)}.}}$3. The image reconstruction method according to claim 1, wherein theX-ray transmission measurement is an X-ray CT imaging of the object, theimage (f_(j) ^(min)) is a volumetric reconstruction of the object, andthe parameters μ_(i) areμ_(i)=Σ_(i′) B′ _(ii′) I ₀exp(−Σ_(j) A _(i′j) f _(i))+r wherein B′_(ii′)is a matrix representing a response function of the detector deviceassigning an i′-th spatial position on the detector surface to the i-thdetector data y_(i), I₀ is an intensity of the unattenuated X ray beam,A_(i′j) is a predetermined system matrix assigning a j-th voxel of theobject to the i′-th spatial position on the detector surface, and r is abackground count parameter of the detector device.
 4. The imagereconstruction method according to claim 1, wherein the X-raytransmission measurement is a planar X-ray imaging of the object, theimage (f_(i′) ^(min)) is a planar reconstruction of an X-ray attenuationimage of the object, the test image f_(i′) is given in terms of a threedimensional test object f_(j) ^(A) asf _(i′) =I ₀exp(−Σ_(j) A _(i′j) f _(j) ^(A)), A_(i′j) is a predeterminedsystem matrix assigning a j-th voxel of the test object to the i′-thspatial position on the detector surface, and the parameters μ_(i) areμ_(i)=Σ_(i′) B′ _(ii′) f _(i′) +r, wherein B′_(ii′) is a matrixrepresenting a response function of the detector device assigning ani′-th spatial position on the detector surface to the i-th detector datay_(i), and r is a background count parameter of the detector device. 5.The image reconstruction method according to claim 5, wherein at leastone of the system matrix A_(i′j) and the response function B′_(ii′) isadjusted depending on measuring system reference data or acquired usinga calibration measurement.
 6. The image reconstruction method accordingto claim 1, wherein the (bi-) orthogonal basis T is one of a basis ofwavelets with compact carrier and an adapted basis depending onproperties of the object to be imaged.
 7. The image reconstructionmethod according to claim 1, wherein the detector data y_(i) areprovided via a data communication channel, from a data storage ordirectly by the detector device.
 8. The image reconstruction methodaccording to claim 1, comprising at least one of the further steps of:storing, recording, displaying and further processing the image f^(min).
 9. An imaging method for creating an image f ^(min) of anobject, comprising the steps of: collecting detector data y_(i) with adetector device of an imaging device, and subjecting the detector datay_(i) to the image reconstruction method according to claim
 1. 10. Animaging device for imaging a region of investigation in an object, theimaging device comprising: an X-ray source arranged for irradiating theobject with X-rays, a detector device for measuring detector data y_(i)comprising Poisson random values measured at a plurality of differentpixel positions of the detector device relative to the object, and areconstruction device for reconstructing an image f ^(min) of theobject, said reconstruction device being adapted for subjecting thedetector data y_(i) to an image reconstruction method according toclaim
 1. 11. A computer program residing on a computer-readable medium,with a program code for carrying out the image reconstruction methodaccording to claim
 1. 12. An apparatus comprising a computer-readablestorage medium containing program instructions for carrying out theimage reconstruction method according to claim
 1. 13. The imagereconstruction method according to claim 3, wherein at least one of thesystem matrix A_(i′j) and the response function B′_(ii′) is adjusteddepending on measuring system reference data or acquired using acalibration measurement.